Abstract

The capacity of a discrete-time memoryless channel in which successive symbols fade independently,
and in which channel state information (CSI) is available neither at the transmitter nor at the
receiver, is considered at low signal-to-noise ratio (SNR). We derive a closed-form expression of the
capacity-achieving input distribution at low SNR and give the exact capacity of a non-coherent channel
at low SNR. The derived relations give a better understanding of the capacity of non-coherent channels
at low SNR and provide an analytical explanation of the peculiar behavior of the optimal input
distribution observed in a previous work by Abou-Faycal, Trott and Shamai. We then compute the
non-coherence penalty and give a more precise characterization of the sub-linear term in SNR. Finally,
in order to better understand how the optimal input varies with SNR, upper and lower bounds on the
capacity-achieving input are given.

Index Terms 

Capacity, non-coherent fading channels, energy efficiency. 



I. INTRODUCTION 

In wireless communication, channel estimation at the receiver is often not possible due,
for instance, to the high mobility of the sender, the receiver, or both. Therefore, achieving
reliable communication over fading channels where the channel state information (CSI) is
available neither at the transmitter nor at the receiver is of particular interest. Establishing
the performance limits, in terms of channel capacity, error probability, etc., in such a non- 
coherent scenario has recently motivated extensive works (see for example [1], [2]). When CSI 
is available at the receiver, the channel capacity, commonly known as the coherent capacity has 
been studied by Ericson [3] for a Single Input Single Output (SISO) channel and recently by many 
other authors for a Multiple Input Multiple Output (MIMO) channel [4], [5]. Conversely, when
CSI is not available at either end, computing the channel capacity, known as the non-coherent
capacity, as well as the optimal input distribution achieving this capacity, for both
SISO and MIMO channels, is a rather tedious task [6], [7]. The main difficulty in computing the
non-coherent capacity lies in the fact that the capacity-achieving input distribution is discrete
with a finite number of mass points, one of which is located at the origin. The number of
these mass points increases with the signal-to-noise ratio (SNR). Since no bound on the number
of mass points with respect to SNR is currently available, it is very difficult to find closed-form
expressions for both the capacity and the optimal input distribution for all SNR values.
Fortunately, numerical computation of the capacity and the optimal input distribution has been
made possible using the Kuhn-Tucker condition, which is necessary and sufficient
for optimality, for a SISO channel [6] and for a MIMO channel [7].

Earlier, in 1999, using a block fading channel, Marzetta and Hochwald obtained the
structure of the optimal input, with explicit calculations for the special case of a SISO channel
at high SNR values or with a large coherence time [8]. The non-coherent capacity was also
computed as a function of the number of transmit and receive antennas as well as the coherence
time at high SNR in [9]. In the low SNR regime, it was also shown in [9] that, to a first order of
magnitude of the SNR, there is no capacity penalty for not knowing the channel at the receiver,
which is not the case in the high SNR regime. It has been well established previously that at low
SNR, just like in an additive white Gaussian noise (AWGN) channel, the capacity of a fading
channel varies linearly with the SNR regardless of whether or not the CSI is available at the
receiver [10], [11]. Recently, this power efficiency in the low SNR regime, or equivalently at a
large channel bandwidth, has motivated work towards a better understanding of the non-coherent
capacity in the low SNR regime [1], [13], [14] for both SISO and MIMO channels using several
fading models.



In this paper, we analyze the capacity of a discrete time non-coherent memoryless Rayleigh 
fading SISO channel at low SNR. The main contributions of this paper are: 

1) Derivation of an analytical closed form of the channel mutual information at low SNR, 
which may also be considered as a lower bound on the channel mutual information for an 
arbitrary SNR value. 

2) Derivation of a fundamental relation between the capacity-achieving input distribution and 
the SNR value, from which an exact capacity expression is deduced at low SNR. 

3) Derivation of novel upper and lower bounds on the non-zero mass point location of the
optimal input, from which lower and upper bounds, respectively, on the non-coherent
capacity at low SNR are deduced.

The paper is organized as follows. Section II presents the system model. In Section III we
derive a closed-form expression of the channel mutual information at low SNR, which is also a
lower bound on the channel mutual information at all SNR values. The optimal input distribution
as well as the non-coherent capacity are presented in Section IV. Numerical results are reported
in Section V and Section VI concludes the paper.

II. CHANNEL MODEL 
We consider a discrete-time memoryless Rayleigh-fading channel given by:

$$ r(l) = h(l)\, s(l) + w(l), \qquad l = 1, 2, 3, \ldots \tag{1} $$

where $l$ is the discrete-time index, $s(l)$ is the channel input, $r(l)$ is the channel output, $h(l)$
is the fading coefficient and $w(l)$ is an additive noise. More specifically, $h(l)$ and $w(l)$ are
independent complex circular Gaussian random variables with mean zero and variances $\sigma_h^2$ and
$\sigma_w^2$, respectively. The input $s(l)$ is subject to an average power constraint, that is
$E[|s(l)|^2] \leq P$, where $E[\cdot]$ indicates the expected value. It is assumed that the channel
state information is available neither at the transmitter nor at the receiver. However, even though
the exact values of $h(l)$ and $w(l)$ are not known, their statistics are known at both ends.

Model (1) appears, for example, in the decomposition of a wideband channel into parallel
noninteracting channels, or when a narrow-band signal is hopped rapidly over a large set of 
frequencies, one symbol per hop [1]. 



Since the channel defined in (1) is stationary and memoryless, the capacity-achieving statistics
of the input $s(l)$ are also memoryless, independent and identically distributed (i.i.d.). Therefore,
for simplicity we may drop the time index $l$ in (1). Consequently, the distribution of the channel
output $r$ conditioned on the input $s$ can be obtained after averaging out the random fading
coefficient $h$, yielding:

$$ f_{r|s}(r|s) = \frac{1}{\pi\left(\sigma_h^2 |s|^2 + \sigma_w^2\right)} \exp\left(-\frac{|r|^2}{\sigma_h^2 |s|^2 + \sigma_w^2}\right). \tag{2} $$

Noting that in (2) the conditional output distribution depends only on the squared magnitudes
$|s|^2$ and $|r|^2$, we will no longer be concerned with complex quantities but only with their squared
magnitudes. Conditioned on the input, $|r|^2$ is chi-square distributed with two degrees of freedom:

$$ f_{|r|^2 \mid |s|^2}(t) = \frac{1}{\sigma_h^2 |s|^2 + \sigma_w^2} \exp\left(-\frac{t}{\sigma_h^2 |s|^2 + \sigma_w^2}\right), \qquad t \geq 0. \tag{3} $$

Normalizing to unit variance, let $y = |r|^2/\sigma_w^2$ and let $x = |s|\,\sigma_h/\sigma_w$. Then (3) may be written
more conveniently as:

$$ f_{y|x}(y|x) = \frac{1}{1+x^2} \exp\left(-\frac{y}{1+x^2}\right), \tag{4} $$

with the average power constraint $E[x^2] \leq a$, where $a = P\sigma_h^2/\sigma_w^2$ is the SNR per symbol time.
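As an illustration of the normalized model, the following sketch simulates $y = |hx + w|^2$ with unit-variance $h$ and $w$ and compares the sample mean and one tail probability with the exponential law of mean $1 + x^2$. The function name, the seed, the sample size and the test values $x = 2$ and $t = 5$ are our own illustrative choices, not part of the original analysis.

```python
import math
import random


def simulate_normalized_output(x, n, rng):
    """Draw n samples of y = |h*x + w|^2 with h, w ~ CN(0, 1) (normalized model)."""
    s = math.sqrt(0.5)  # std of each real component of a unit-variance complex Gaussian
    samples = []
    for _ in range(n):
        h = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        w = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        samples.append(abs(h * x + w) ** 2)
    return samples


rng = random.Random(0)   # fixed seed, illustrative
x = 2.0                  # normalized input amplitude, illustrative
ys = simulate_normalized_output(x, 100_000, rng)

empirical_mean = sum(ys) / len(ys)
theoretical_mean = 1.0 + x * x                    # mean of the exponential law

t = 5.0
empirical_tail = sum(1 for y in ys if y > t) / len(ys)
theoretical_tail = math.exp(-t / (1.0 + x * x))   # P(y > t | x) for that law

print(empirical_mean, theoretical_mean)
print(empirical_tail, theoretical_tail)
```

The two printed pairs should agree up to Monte Carlo fluctuations, which is a quick way to check the normalization of $x$, $y$ and $a$.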



III. THE CHANNEL MUTUAL INFORMATION 
For the channel (4), the mutual information is given by [12]:

$$ I(x;y) = \int_0^\infty \int_0^\infty f_{y|x}(y|x)\, f_x(x) \ln\frac{f_{y|x}(y|x)}{f_y(y)}\, dy\, dx. \tag{5} $$

The capacity of channel (4) is the supremum

$$ C = \sup_{E[x^2] \leq a} I(x;y) \tag{6} $$

over all input distributions that meet the power constraint. The existence and uniqueness of such
an input distribution was established in [6]. More specifically, the optimal input distribution for
channel (4) is discrete with a finite number of mass points, one of which is necessarily
located at zero. That is, the capacity (6) is expressed by

$$ C = \max_{E[x^2] \leq a} \sum_{i=0}^{N-1} p_i \int_0^\infty f_{y|x}(y|x_i) \ln\frac{f_{y|x}(y|x_i)}{\sum_{j=0}^{N-1} p_j f_{y|x}(y|x_j)}\, dy, \tag{7} $$

where $x_0 = 0 < x_1 < x_2 < \ldots < x_{N-1}$ are the mass point locations and $p_0, p_1, \ldots, p_{N-1}$
are their respective probabilities. This optimization problem is very difficult since the number
of discrete mass points, their optimum probabilities and their locations are unknown. In [6],
numerical evaluation of the capacity and the optimum input distribution was given using the
Kuhn-Tucker condition, which is necessary and sufficient for optimality. The authors found
empirically that two mass points are optimal for low SNR and that the number of mass points
increases monotonically with SNR. Many other papers have used these results in order to further
understand the non-coherent capacity and the optimal input distribution behavior as the SNR
approaches zero [13], [14].

Since we focus on the low SNR regime, we may use in (7) a discrete input distribution
with two mass points, one of which is located at zero, to obtain the capacity at low SNR.
Furthermore, this on-off signaling also provides a lower bound on the non-coherent capacity for
all SNR values. Indeed, it was shown in [6] using computer simulation that on-off signaling
provides a tight lower bound on the capacity for the SNR values considered. That is, a lower
bound on the capacity may be expressed by:

$$ C_{LB} = \max_{E[x^2] \leq a} I_{LB}(x;y), \tag{8} $$

where $I_{LB}(x;y)$ is a lower bound on the channel mutual information $I(x;y)$ given by:

$$ I_{LB}(x;y) = \sum_{i=0}^{1} p_i \int_0^\infty f_{y|x}(y|x_i) \ln\frac{f_{y|x}(y|x_i)}{p_0 f_{y|x}(y|0) + p_1 f_{y|x}(y|x_1)}\, dy, \qquad x_0 = 0, \tag{9} $$

and the average power constraint becomes $p_1 x_1^2 \leq a$. Note that the optimization problem in
(8) is less complex than in (7) since we deal with only two unknowns, $p_1$ and $x_1$. Furthermore,
it is proven below that further simplifications can be obtained using the fact that $I_{LB}(x_1, p_1)$
is monotonically increasing in $x_1$, so that the problem at hand may be reduced to a simpler
maximization problem without constraint. We summarize this result in Lemma 1.

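Lemma 1: The capacity at low SNR, which is also a lower bound on the capacity for all SNR values, is
given by:

$$ C_{LB} = \max_{x_1 \geq \sqrt{a}} I_{LB}(x_1, a), \tag{10} $$

where $I_{LB}(x_1, a)$ is the channel mutual information for a given mass point location $x_1$ and a
given SNR value $a$. Furthermore, $I_{LB}(x_1, a)$ may be written as:

$$ I_{LB}(x_1,a) = -\ln\left(1-\frac{a}{x_1^2}\right) + \frac{a}{x_1^2}\left(x_1^2 - \ln(1+x_1^2)\right) - \ln(1+\beta) - \frac{a}{1+x_1^2}\left[1 + x_1^2\, {}_2F_1\left(1, \frac{1}{x_1^2}, 1+\frac{1}{x_1^2}, -\frac{1}{\beta}\right)\right], \qquad \beta = \frac{a}{(x_1^2-a)(1+x_1^2)}, \tag{11} $$

where ${}_2F_1(\cdot, \cdot, \cdot, \cdot)$ is the Gauss hypergeometric function.

Proof: For convenience, the proof is presented in Appendix I. ■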
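The maximization over $x_1$ in Lemma 1 can be explored numerically without the hypergeometric function. The sketch below is only an illustration under our own assumptions: it evaluates the on-off mutual information as $h(y) - h(y|x)$, with the output entropy computed by a composite Simpson rule, fixes $p_1 = a/x_1^2$, and scans a coarse grid of $x_1$ values. The function name, the grid, the integration range and the value $a = 10^{-3}$ are illustrative choices.

```python
import math


def on_off_mutual_information(x1, a, n=8000):
    """I_LB(x1, a) in nats for the input {0, x1} with P(x = x1) = a / x1**2."""
    alpha = 1.0 + x1 * x1
    p1 = a / (x1 * x1)
    p0 = 1.0 - p1

    def integrand(y):
        # Output density: mixture of two exponential laws with means 1 and alpha.
        f = p0 * math.exp(-y) + (p1 / alpha) * math.exp(-y / alpha)
        return -f * math.log(f)

    # Composite Simpson rule on [0, 40*alpha]; the neglected tail is of order exp(-40).
    upper = 40.0 * alpha
    h = upper / n
    total = integrand(0.0) + integrand(upper)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    h_y = total * h / 3.0

    # An exponential law with mean m has differential entropy 1 + ln(m).
    h_y_given_x = p0 * 1.0 + p1 * (1.0 + math.log(alpha))
    return h_y - h_y_given_x


a = 1e-3                                     # SNR of -30 dB, illustrative
grid = [1.5 + 0.02 * k for k in range(76)]   # x1 from 1.5 to 3.0
values = [on_off_mutual_information(x1, a) for x1 in grid]
best_value = max(values)
best_x1 = grid[values.index(best_value)]
print(best_x1, best_value)
```

The objective is quite flat around its maximum, so the grid optimum should be read as approximate; a finer grid or a one-dimensional optimizer could replace the scan.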



In Lemma 1, the existence of a maximum for a given SNR value $a$ is guaranteed by the
continuity of $I_{LB}(x_1, a)$ and the fact that it is bounded with respect to $x_1$ over the interval
$[\sqrt{a}, \infty[$. This can be readily seen in Fig. 1, where we have plotted the lower bound
$I_{LB}(x_1, a)$ for different values of $a$. As can be seen in Fig. 1, $I_{LB}(x_1, a)$ has a maximum
for the three SNR regimes. The existence of such a maximum is also rigorously established in
Appendix I. As discussed in Appendix I, the maximization (10) reduces to solving the equation
$\frac{\partial}{\partial x_1} I_{LB}(x_1, a) = 0$ for a given SNR value $a$. Ideally, an analytical
solution would provide an insight as to how the non-coherent capacity and the optimal input
distribution vary with the SNR. However, solving such an equation for arbitrary SNR values is very
ambitious since it requires an analytical solution to a transcendental equation. Nevertheless, it is
of interest to focus on the low SNR regime to benefit from some advantageous simplifications in
order to elucidate the non-coherent capacity behavior at low SNR.



In this section, we will use Lemma 1 to derive a fundamental analytical relation between the
optimal input distribution in the low SNR regime and the particular SNR value $a$. We show in
Theorem 1 that this fundamental relation holds up to an order of $a$ strictly less than 2. As is
shown below, the derived relation is very useful since it allows computing the optimal input
distribution for a given SNR value $a$ while providing a rigorous characterization of how the
non-zero mass point location and its probability vary with $a$. Moreover, the derived relation
may be used to compute the exact non-coherent capacity at low SNR values.



IV. NON-COHERENT CAPACITY AT LOW SNR 



A. A fundamental relation between the optimal input distribution and the SNR 

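We present the fundamental relation between the optimal input distribution and the SNR value
in the following theorem.

Theorem 1: At a low SNR value $a$, the optimal input probability distribution, for an order of
magnitude of $a$ strictly less than 2, is given by:

$$ f_x(x) = \begin{cases} x_1 & \text{with probability } p_1 = a/x_1^2, \\ 0 & \text{with probability } p_0 = 1 - p_1, \end{cases} \tag{12} $$

where $x_1$ is the solution of the equation:

$$ x_1^2 - (1+x_1^2)\ln(1+x_1^2) - \pi\csc\left(\frac{\pi}{x_1^2}\right)\left(x_1^2+x_1^4\right)^{-\frac{1}{x_1^2}} a^{\frac{1}{x_1^2}} \left[\ln(a) + x_1^2 - \pi\cot\left(\frac{\pi}{x_1^2}\right) - \ln(x_1^2) - \ln(1+x_1^2) + 1\right] = 0. \tag{13} $$

Furthermore, the non-coherent channel capacity is given by:

$$ C(a, x_1) = a - a\,\frac{\ln(1+x_1^2)}{x_1^2} - a^{1+\frac{1}{x_1^2}}\,\frac{\pi\csc\left(\frac{\pi}{x_1^2}\right)\left(x_1^2+x_1^4\right)^{-\frac{1}{x_1^2}}}{1+x_1^2}. \tag{14} $$

Proof: For convenience, the proof is presented in Appendix II. ■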
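For a given SNR value, the mass point location of Theorem 1 can also be obtained by maximizing the low-SNR capacity expression directly rather than by solving the stationarity equation. The sketch below assumes that (14) reads $C(a, x_1) = a - a\ln(1+x_1^2)/x_1^2 - a^{1+1/x_1^2}\,\pi\csc(\pi/x_1^2)(x_1^2+x_1^4)^{-1/x_1^2}/(1+x_1^2)$ and uses a plain ternary search in $u = x_1^2$; the function names, the search interval and the value $a = 10^{-3}$ are our own illustrative choices, and the search assumes the expression is unimodal on that interval.

```python
import math


def capacity_low_snr(a, u):
    """Low-SNR capacity expression (14) in nats, with u = x1**2 > 1."""
    b = 1.0 / u
    first = a * math.log(1.0 + u) / u
    second = (a ** (1.0 + b)) * math.pi / math.sin(math.pi * b) \
        * (u + u * u) ** (-b) / (1.0 + u)
    return a - first - second


def best_mass_point(a, lo=1.5, hi=50.0, iters=100):
    """Ternary search for the u = x1**2 maximizing (14) on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if capacity_low_snr(a, m1) < capacity_low_snr(a, m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


a = 1e-3                       # SNR of -30 dB, illustrative
u_star = best_mass_point(a)    # approximate optimal x1**2
c_star = capacity_low_snr(a, u_star)
print(u_star, c_star)
```

Because the expression is very flat near its maximum, the capacity value is far less sensitive to the search tolerance than the location $u$ itself.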



Clearly, (13) is also a transcendental equation, for which determining an analytical solution
is a very tedious task. Although it is very involved to derive an analytical solution of (13) in
the form $x_1 = f(a)$, it is of interest, from an engineering point of view, to solve (13)
numerically and obtain the optimal $x_1$ for a given SNR value $a$. One may then get the value of
the non-coherent capacity by substituting the obtained value of $x_1$ into (14). Moreover, (13) provides
some insight into the behavior of $x_1$ as $a$ tends toward zero. For example, using (13), one may
determine the limit of $x_1$ as $a$ tends toward zero. To see this, let $M$ be this limit and let us
assume that $M$ is finite. From Appendix II, we know that for the optimal input distribution, the
non-zero mass point location $x_1$ is greater than one. Thus, its limit as $a$ tends toward zero is
greater than or equal to one, $M \geq 1$. Then, taking the limits on both sides of (13) as $a$ goes to zero
yields:

$$ M^2 - (1 + M^2)\ln(1 + M^2) = 0. \tag{15} $$

That is, if $M$ is finite, it would be equal to zero, the unique solution to (15), but this is impossible
since $M \geq 1$. Hence, consistently with [6], [13], $\lim_{a \to 0} x_1 = \infty$. Furthermore, we have found that
(13) may be written in a more convenient way as:

$$ a = \exp\left[x_1^2\, W(k, \varphi(x_1)) - x_1^2 + \pi\cot\left(\frac{\pi}{x_1^2}\right) + \ln(x_1^2) + \ln(1+x_1^2) - 1\right], \tag{16} $$

with $k = -1$ if $a < a_0$ and $k = 0$ elsewhere, and where $W(\cdot,\cdot)$ is the Lambert function, with
$\varphi(x)$ given by:

$$ \varphi(x) = -\frac{\sin\left(\frac{\pi}{x^2}\right)\left(-x^2 + \ln(1+x^2) + x^2\ln(1+x^2)\right)}{\pi x^2} \exp\left(-\frac{\pi\cot\left(\frac{\pi}{x^2}\right)}{x^2} + 1 + \frac{1}{x^2}\right). \tag{17} $$

Also, $a_0$ is the solution of (13) for $x_1 = x_0$, where $x_0$ is the root of the equation $\varphi(x) = -1/e$.
The number $-1/e$ comes out in our analysis from the fact that it is the unique point shared by
the principal branch of the Lambert function $W(0, x)$ and the branch with $k = -1$, $W(-1, x)$.
That is, $W(0, -1/e) = W(-1, -1/e) = -1$. This guarantees the continuity of $a$ in (16) for all $x_1$ values.

Numerically, we have found that $a_0 = 0.0582$ and $x_0 = \sqrt{3.93388}$. Hence, (16) may also be
viewed as a fundamental relation between the optimal input distribution and $a$ for discrete-time
non-coherent memoryless Rayleigh fading channels at low SNR. In addition, (16) provides
a global answer as to how the non-zero mass point location of the optimal on-off signaling
and the SNR are linked together. For this purpose, a simple analysis of (16) has been carried out and
some important results are summarized in the following corollary.

Corollary 1: At low SNR, we have:

1) For all $a < a_0$, $a_0 = 0.0582$, $a$ is a decreasing function of $x_1$, and for all
$a > a_0$, $a$ is an increasing function of $x_1$.

2) For all $a$, $x_1 \geq x_0$, where $x_0 = \sqrt{3.93388}$.

3) $\lim_{x_1 \to \infty} a = 0$.
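The relation between $a$ and $x_1$ through the Lambert function, and the two branches distinguished in Corollary 1, can be explored with a few lines of code. The sketch below is illustrative only: it assumes that (16) has the form $a = \exp[x_1^2 W(k,\varphi(x_1)) - x_1^2 + \pi\cot(\pi/x_1^2) + \ln(x_1^2) + \ln(1+x_1^2) - 1]$ with $\varphi$ as in (17), and it evaluates the two real branches of $W$ by bisection instead of relying on a library routine. All function names are our own.

```python
import math


def lambert_w(k, z, iters=200):
    """Real Lambert W on branch k in {0, -1} for -1/e <= z < 0, by bisection."""
    if not (-1.0 / math.e <= z < 0.0):
        raise ValueError("z must lie in [-1/e, 0)")
    lo, hi = (-1.0, 0.0) if k == 0 else (-700.0, -1.0)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        val = mid * math.exp(mid)
        # w*exp(w) increases on [-1, 0] and decreases on (-inf, -1].
        if (val < z) == (k == 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phi(u):
    """Argument (17) of the Lambert function, written in terms of u = x**2 > 1."""
    b = math.pi / u
    return (math.sin(b) * (u - (1.0 + u) * math.log(1.0 + u)) / (math.pi * u)
            * math.exp(1.0 + 1.0 / u - math.pi / (u * math.tan(b))))


def snr_from_mass_point(u, k=-1):
    """SNR value a associated with u = x1**2 on branch k, following (16)."""
    w = lambert_w(k, phi(u))
    log_a = (u * w - u + math.pi / math.tan(math.pi / u)
             + math.log(u) + math.log(1.0 + u) - 1.0)
    return math.exp(log_a)


u = 4.96815
a_low = snr_from_mass_point(u, k=-1)    # branch used for a < a0
a_high = snr_from_mass_point(u, k=0)    # branch used for a > a0
print(phi(3.93388), -1.0 / math.e)      # phi at x0**2 is close to -1/e
print(a_low, a_high)
```

Evaluating `phi` at $x_0^2 = 3.93388$ returns a value very close to $-1/e$, the branch point shared by $W(0,\cdot)$ and $W(-1,\cdot)$; the Lambert function itself is deliberately not evaluated there, since rounding of $x_0$ can push the argument slightly outside the real domain.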

Corollary 1 agrees with [6], where it was shown using computer simulation that the non-zero
mass point location passes through a minimum before moving upward. However, by specifying
the edge point $(x_0, a_0)$, Corollary 1 gives a more precise characterization of this peculiar
behavior of the non-zero mass point location. Furthermore, Corollary 1 also refines the lower
bound on $x_1$, $x_1 > 1$, and provides $x_0$ as an improved lower bound on the non-zero mass point
location at low SNR. Moreover, from (16), we may write:



$$ \ln(a) + x_1^2 = x_1^2\, W(k, \varphi(x_1)) + \pi\cot\left(\frac{\pi}{x_1^2}\right) + \ln(x_1^2) + \ln(1+x_1^2) - 1. \tag{18} $$

It is then easy to check that the right hand side (RHS) of (18) is a decreasing function of $x_1$ for
$x_1 < x_0$, which yields an upper bound on $x_1$:

$$ x_1^2 \leq -\ln(a) + \epsilon_0, \tag{19} $$

where $\epsilon_0 = \ln(a_0) + x_0^2$, which is again consistent with the upper bound derived in [13]. Note
that the upper bound (19) is valid for all $a < a_0$, whereas the upper bound provided in [13] holds
for $a \ll a_0$, for which $\epsilon_0$ is negligible. On the other hand, combining (19) and the lower bound
on $x_1$ provided in Corollary 1, one may obtain:

$$ a^{\alpha} x_0^2 \leq a^{\alpha} x_1^2 \leq a^{\alpha}\left(\epsilon_0 - \ln(a)\right), \tag{20} $$

for all $\alpha > 0$. That is:

$$ \lim_{a \to 0}\left(a^{\alpha} x_1^2\right) = 0, \tag{21} $$

which means that $a^{\alpha}$ tends toward zero faster than $x_1^2$ tends toward infinity. This result may also
be used to gain further insight into the capacity behavior at low SNR. For instance, from (14),
we may write the non-coherent capacity as:

$$ C(a) = a + o(a), \tag{22} $$

where $o(a) = -a\,\frac{\ln(1+x_1^2)}{x_1^2} - a^{1+\frac{1}{x_1^2}}\,\frac{\pi\csc\left(\frac{\pi}{x_1^2}\right)\left(x_1^2+x_1^4\right)^{-\frac{1}{x_1^2}}}{1+x_1^2}$, meaning that the non-coherent capacity
varies linearly with $a$ at low SNR and hence non-coherent communication at low SNR may be
qualified as energy-efficient communication.

B. Energy efficiency and non-coherence penalty 

In general, the capacity of a channel, including a Gaussian channel and a Rayleigh channel,
varies linearly with the SNR at low SNR [13]. The difference between these channels in terms of
capacity can only be explained by the sub-linear term $o(a)$ in (22). The sub-linear term has been
defined in [13] as:

$$ \Delta(a) := a - C(a). \tag{23} $$



At low SNR, the sub-linear term $\Delta(a)$ is also related to the energy efficiency. Let $E_n$ be the
transmitted energy in Joules per information nat; then we have:

$$ \frac{E_n}{\sigma_w^2} \cdot C(a) = a. \tag{24} $$

Using (23), we can write:

$$ \frac{E_n}{\sigma_w^2} = \frac{1}{1 - \frac{\Delta(a)}{a}} \approx 1 + \frac{\Delta(a)}{a}, \tag{25} $$

where the approximation holds if $\Delta(a)/a$ is sufficiently small. Note that if

$$ \frac{\Delta(a)}{a} \to 0, \tag{26} $$

then from (23) and (25), we have respectively:

$$ C(a) \approx a \tag{27} $$

$$ \frac{E_n}{\sigma_w^2} \approx 1, \tag{28} $$

which implies that the highest energy efficiency of -1.59 dB per information bit could
theoretically be achieved. For a Gaussian channel and a fading channel under the coherent assumption,
the sub-linear terms are respectively given by [13]:

$$ \Delta_{AWGN}(a) = \frac{1}{2} a^2 + o(a^2) \tag{29} $$

$$ \Delta_{coherent}(a) = \frac{1}{2} E\left[\|h\|^4\right] a^2 + o(a^2) \tag{30} $$
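The -1.59 dB figure quoted above and the effect of the sub-linear term on the energy per nat can be reproduced with a short computation. The sketch below assumes the normalized relation "energy per nat times capacity equals SNR", so that the normalized energy per nat is $1/(1 - \Delta(a)/a)$; the function name and the sample ratio 0.01 are our own illustrative choices.

```python
import math


def energy_per_nat(delta_ratio):
    """Normalized energy per nat, a / C(a) = 1 / (1 - Delta(a)/a)."""
    return 1.0 / (1.0 - delta_ratio)


# When Delta(a)/a -> 0 the normalized energy per nat tends to 1;
# one bit carries ln(2) nats, hence the minimum energy per bit in dB:
min_energy_per_bit_db = 10.0 * math.log10(energy_per_nat(0.0) * math.log(2.0))

# First-order approximation 1 + Delta(a)/a for a small ratio:
exact = energy_per_nat(0.01)
approx = 1.0 + 0.01

print(min_energy_per_bit_db)
print(exact, approx)
```

The gap between `exact` and `approx` is of second order in $\Delta(a)/a$, which is why the approximation is only used when that ratio is small.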

For a non-coherent Rayleigh fading channel, the sub-linear term can be computed using (14):

$$ \Delta(a) = a\,\frac{\ln(1+x_1^2)}{x_1^2} + a^{1+\frac{1}{x_1^2}}\,\frac{\pi\csc\left(\frac{\pi}{x_1^2}\right)\left(x_1^2+x_1^4\right)^{-\frac{1}{x_1^2}}}{1+x_1^2}. \tag{31} $$

Note that at very low SNR and following (31), $\Delta(a)/a$ converges to zero, making the non-coherent
Rayleigh channel also energy efficient. However, as SNR increases, the convergence of $\Delta(a)/a$ to
zero is slower than that of $\Delta_{AWGN}(a)/a$ and $\Delta_{coherent}(a)/a$. This can be seen from (21), which
indicates that $x_1$ converges to infinity more slowly than $a$ converges to zero. To illustrate this,
let us calculate the value of $\Delta(a)/a$ for an SNR value $a = -30$ dB. Following (31), we can write:

$$ \frac{\Delta(a)}{a} = \frac{\ln(1+x_1^2)}{x_1^2} + a^{\frac{1}{x_1^2}}\,\frac{\pi\csc\left(\frac{\pi}{x_1^2}\right)\left(x_1^2+x_1^4\right)^{-\frac{1}{x_1^2}}}{1+x_1^2}. \tag{32} $$

Solving (16) for $a = -30$ dB with respect to $x_1^2$ yields $x_1^2 \approx 4.96815$. Then, substituting this
value in (32), we obtain $\Delta(a)/a \approx 49\%$. Note that for AWGN and coherent Rayleigh fading
channels, $\Delta_{AWGN}(a)/a$ and $\Delta_{coherent}(a)/a$ are of the same order of magnitude as the SNR value in
this case. It takes a lower SNR for non-coherent communication to achieve the same energy
efficiency as AWGN and coherent Rayleigh fading channels.

In the range of SNR values of interest, we may define the non-coherence penalty per SNR
as:

$$ \frac{C_{coherent}(a) - C(a)}{a}, \tag{33} $$

where $C_{coherent}$ is the channel capacity under the coherent assumption. Now, from [13], we can write
$C_{coherent}$ as:

$$ C_{coherent}(a) = a + O(a^2) = a + o(a^{2-\alpha}), \tag{34} $$

for any $1 > \alpha > 0$. Recalling that the non-coherent capacity in (14) was obtained using a series
expansion to an order strictly smaller than 2, then combining (14) and (34), we derive the
exact non-coherence penalty per SNR up to this order:

$$ \frac{C_{coherent}(a) - C(a)}{a} \approx \frac{C_{coherent}(a) - C(a)}{C_{coherent}(a)} = \frac{\ln(1+x_1^2)}{x_1^2} + a^{\frac{1}{x_1^2}}\,\frac{\pi\csc\left(\frac{\pi}{x_1^2}\right)\left(x_1^2+x_1^4\right)^{-\frac{1}{x_1^2}}}{1+x_1^2}. \tag{35} $$

Now using (21), dividing both sides of (35) by $a^{\alpha}$ ($\alpha > 0$) and taking the limit as $a$ tends to
zero yields:

$$ C_{coherent}(a) - C(a) \gg a^{1+\alpha}, \tag{36} $$

where $\gg$ means:

$$ \lim_{a \to 0} \frac{C_{coherent}(a) - C(a)}{a^{1+\alpha}} = \infty. \tag{37} $$

Inequality (36) indicates that the non-coherence penalty is not only much greater than $a^2$, as was
established in [1], but more precisely, it is much greater than $a^{1+\alpha}$, since $a^{1+\alpha} \gg a^2$ for $1 > \alpha > 0$.
Again, this result is in full agreement with [13].

In this subsection, we have discussed exact closed forms of the optimal input distribution and
the non-coherent capacity based on the fundamental relation (13) or equivalently (16). However,
one may be interested in deriving simpler lower and upper bounds on these quantities in order
to better understand how they vary with the SNR value $a$. This is discussed next.



C. Upper and lower bounds on the non-coherent capacity 

Considering (16), since we are interested in the low SNR regime, we assume for simplicity
that $a < a_0$. Thus the Lambert function in (16) is the branch with $k = -1$, that is $W(-1, x)$.
A lower bound on the non-coherent capacity is easily obtained by combining (19) and (14) and
will be referred to as $C_{LB}(a)$. We now derive the lower bound on the optimal non-zero mass
point location and the upper bound on the non-coherent capacity in Theorem 2.

Theorem 2: At low SNR values a, a lower bound on the optimal non-zero mass point location 
is given by: 

Xl.LB = I ^ (38) 



where y = y 1 + In i. Furthermore, an upper bound on the non-coherent capacity can be obtained 
from (14) as:

$$ C_{UB}(a) = C(a, x_{1,LB}). \tag{39} $$

Proof: For convenience, the proof is presented in Appendix III. ■



V. NUMERICAL RESULTS AND DISCUSSION

The curves in Fig. 2 show, respectively, the non-zero mass point location $x_1$ of the
capacity-achieving input distribution obtained using the maximization (10), and the one obtained using
relation (13) or equivalently (16). As can be seen from Fig. 2, the two curves are indistinguishable
at low SNR, confirming that (11) is exact at low SNR. As the SNR increases, a small discrepancy
between the two curves starts to appear. This is expected since (16) holds up to an order of
magnitude strictly smaller than 2 and thus for small SNR values, (but not smaller than about 
2.10^^), a discrepancy may appear. Nevertheless, even for an SNR greater than 2.10^^, the curve 
obtained using (16) is very instructive, especially as it follows the same shape as the one obtained
by simulation. An interesting direction for future work would be to use (11) in order to understand
why a new mass point should appear as the SNR increases. It should be mentioned that the
discrepancy observed in Fig. 2 may be rendered as small as desired using a higher-order series
expansion. However, the analysis would become prohibitively complex.



Figure 3 depicts the non-coherent capacity curves. Again, the curve obtained by computer
simulation and the one obtained using (14) are indistinguishable. More interestingly, the discrepancy
observed at not very low SNR values in Fig. 2 has vanished, implying that the capacity is not very
sensitive to the non-zero mass point location. Also shown in Fig. 3 is the linear approximation
$C(a) = a$, which is an upper bound on the capacity. As can be noticed in Fig. 3, the linear
approximation follows the same shape as the exact non-coherent capacity curves at low SNR
and becomes quite loose for SNR values greater than 10^^. This implies that the sub-linear term 
defined in (23) is much more important at these SNR values. This can be seen in Fig. 4, where
we have plotted the non-coherence penalty percentage given by (35). Figure 4 confirms that
there is no substantial gain from channel knowledge, in a capacity sense, at very low SNR, thus
indicating that non-coherent communication is almost as power-efficient as AWGN and coherent
communications. As the SNR increases, a non-coherence penalty begins to appear, reaching up
to 70%.
The derived upper and lower bounds on the non-zero mass point location given respectively
by (19) and (38), as well as the bounds derived in [13], are plotted in Fig. 5 along with
the exact curves at low SNR. As can be seen in Fig. 5, the upper bound in [13], albeit tighter
than ([T9l ). crosses the exact curves at about 2.10"^. At these not so low SNR values, the derived 
bound in [13] is no longer an upper bound, consistently with our discussion in Subsection IV-A.
On the other hand, the lower bound (38) is tighter than the one derived in [13] for all SNR
values.

VI. CONCLUSION

In this paper, we have addressed the analysis of the capacity of discrete-time non-coherent 
memoryless Rayleigh fading channels at low SNR. We have computed explicitly the channel 
mutual information at low SNR which is also a lower bound on the channel mutual information, 
albeit not necessarily at low SNR values. 

Using the derived expression of the channel mutual information, we have been able to provide
a fundamental relation between the non-zero mass point location of the capacity-achieving input
distribution and the SNR. This fundamental relation gives a complete answer as to how
the optimal input distribution varies with the power constraint at low SNR. It also provides
an analytical explanation of what was previously observed through computer simulation in [6]
about the peculiar behavior of the non-zero mass point location at low SNR values. The exact
non-coherent capacity has been derived, and insights into the capacity behavior that can be
gained through functional analysis have been shown.

In order to better understand how the non-zero mass point location varies with the SNR, we
have also derived lower and upper bounds, which have been compared to recently derived bounds.
The newly derived lower bound is tighter for all SNR values of interest, whereas the upper bound,
although somewhat looser, was shown to hold for larger SNR values.

APPENDIX I
PROOF OF LEMMA 1

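For convenience, we will use $f(x)$ instead of $f_x(x)$ to denote the probability density function
of the random variable $x$ at the value $x$. We first prove that $I_{LB}(x;y)$ is a strictly monotonically
increasing function of $x_1$. Differentiating (9) with respect to $x_1$ yields

$$ \frac{\partial}{\partial x_1} I_{LB}(x_1,p_1) = p_1 \int_0^\infty \frac{\partial}{\partial x_1} f(y|x_1) \ln\left(\frac{f(y|x_1)}{f(y)}\right) dy. \tag{I.40} $$

Differentiating (4), we obtain:

$$ \frac{\partial}{\partial x_1} f(y|x_1) = \frac{2x_1}{(1+x_1^2)^2}\left[y - (1+x_1^2)\right] f(y|x_1). \tag{I.41} $$

Substituting (I.41) in (I.40) yields:

$$ \frac{\partial}{\partial x_1} I_{LB}(x_1,p_1) = \frac{2 x_1 p_1}{(1+x_1^2)^2} \int_0^\infty \left[y - (1+x_1^2)\right] f(y|x_1) \ln\left(\frac{f(y|x_1)}{f(y)}\right) dy. \tag{I.42} $$

Let $g(y)$ be defined as $g(y) = \ln\left(\frac{f(y|x_1)}{f(y)}\right)$. Now, we need the following lemma.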

Lemma 2: Let $f(y)$ be a probability density function with mean $m$. If $g(y)$ is a strictly
monotonically increasing function then

$$ \int_0^\infty (y-m)\, f(y)\, g(y)\, dy > 0. \tag{I.43} $$

Proof: The proof follows along similar lines as Lemma 1 in [6]. ■
To apply Lemma 2, it is sufficient to note that

$$ \frac{f(y)}{f(y|x_1)} = p_1 + p_0 (1+x_1^2) \exp\left(-\frac{x_1^2}{1+x_1^2}\, y\right) \tag{I.44} $$

is strictly decreasing with respect to $y$ because the exponent of the exponential function is
negative; therefore $\frac{f(y|x_1)}{f(y)}$ is strictly increasing and so is $g(y)$. Finally, using the fact that
$(1+x_1^2)$ is the mean of $f(y|x_1)$ and applying Lemma 2 to (I.42), we obtain:

$$ \frac{\partial}{\partial x_1} I_{LB}(x_1,p_1) > 0, \tag{I.45} $$

which means that $I_{LB}(x_1,p_1)$ is strictly increasing with respect to $x_1$. Consequently, the average
power constraint holds with equality. That is, $E[x^2] = p_1 x_1^2 = a$. Hence (8) is equivalent to:

$$ C_{LB} = \max I_{LB}(x_1,p_1) \quad \text{subject to} \quad x_1 \geq \sqrt{a}, \quad p_1 x_1^2 = a. \tag{I.46} $$

Note that the technique used here to prove that $I_{LB}(x;y)$ is a strictly monotonically increasing
function of $x_1$ follows along the same lines as the technique used in [6] to establish that the
optimal input distribution necessarily has a mass point at zero, although the two techniques have
strictly different objectives.

Next, we prove the existence of the maximum in (I.46). Clearly, $I_{LB}(x_1,p_1)$ is now a function
of $x_1$ and $a$ since $p_1 x_1^2 = a$. That $x_1 \geq \sqrt{a}$ follows automatically from the fact that $p_1 \leq 1$. On
the other hand, $I_{LB}(x_1,p_1)$ in (9) is positive and continuous with respect to $x_1$ and $p_1$, and
thus so is $I_{LB}(x_1, a)$ for a given SNR value $a$. Moreover, $I_{LB}(x_1, a)$ is upper-bounded over the
interval $[\sqrt{a}, \infty[$; otherwise, one would have, for some SNR value, say $a^{\circ}$:

$$ \forall \epsilon > 0, \ \exists\, x_1^{\circ} \geq \sqrt{a^{\circ}} \ \mid \ I_{LB}(x_1^{\circ}, a^{\circ}) > \epsilon. \tag{I.47} $$

But this statement also means that the channel mutual information, an upper bound on $I_{LB}(x_1, a^{\circ})$,
is unbounded for $a^{\circ}$, which contradicts the fact that the capacity exists for all SNR values as
proven in [6]. Hence, $I_{LB}(x_1,a)$ is necessarily upper-bounded. Furthermore, the continuity of
$I_{LB}(x_1, a)$ over $[\sqrt{a}, \infty[$ implies that the upper bound is either achieved at a finite value $x_1$ or at
$\infty$. The last case is however impossible. To see this, it is sufficient to observe that for a given $a$, as
$x_1$ goes to infinity, $p_1$ tends toward zero. Thus, following (9), $\lim_{x_1 \to \infty} I_{LB}(x_1, a) = I_{LB}(\infty, 0) = 0$,
and consequently $I_{LB}(x_1,a) = 0$ for all $x_1 \in [\sqrt{a}, \infty[$, which is impossible since the discrete
input $x$ and the output $y$ are dependent. That is, the upper bound is achieved at a finite
value $x_1$, which proves the existence of the maximum in (I.46). Moreover, since the maximum
is not at the borders of $[\sqrt{a}, \infty[$, we necessarily have at the maximum $\frac{\partial}{\partial x_1} I_{LB}(x_1, a) = 0$.



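Finally, in order to prove (11), we directly compute the lower bound $I_{LB}(x_1,p_1)$ from (9):

$$ I_{LB}(x_1,p_1) = \underbrace{p_0 \int_0^\infty f(y|0) \ln(f(y|0))\, dy}_{I_1} - \underbrace{p_0 \int_0^\infty f(y|0) \ln(f(y))\, dy}_{I_2} + \underbrace{p_1 \int_0^\infty f(y|x_1) \ln(f(y|x_1))\, dy}_{I_3} - \underbrace{p_1 \int_0^\infty f(y|x_1) \ln(f(y))\, dy}_{I_4}. \tag{I.48} $$

$I_1$ and $I_3$ may be easily computed:

$$ I_1 = p_0 \int_0^\infty e^{-y} \ln\left(e^{-y}\right) dy = -p_0 = -(1 - p_1) \tag{I.49} $$

$$ I_3 = p_1 \int_0^\infty \frac{1}{1+x_1^2}\, e^{-\frac{y}{1+x_1^2}} \ln\left(\frac{1}{1+x_1^2}\, e^{-\frac{y}{1+x_1^2}}\right) dy = -p_1\left(1 + \ln(1+x_1^2)\right). \tag{I.50} $$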



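$$ I_2 = \int_0^\infty p_0 e^{-y} \ln\left(p_0 e^{-y} + \frac{p_1}{1+x_1^2}\, e^{-\frac{y}{1+x_1^2}}\right) dy = \underbrace{\int_0^\infty p_0 e^{-y} \ln\left(p_0 e^{-y}\right) dy}_{I_{21}} + \underbrace{\int_0^\infty p_0 e^{-y} \ln\left(1 + \frac{p_1}{p_0(1+x_1^2)}\, e^{\frac{x_1^2}{1+x_1^2} y}\right) dy}_{I_{22}}. \tag{I.51} $$

$I_{21}$ can be easily computed:

$$ I_{21} = p_0 \left[\ln(p_0) - 1\right]. \tag{I.52} $$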



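In order to compute $I_{22}$, let $\alpha = 1 + x_1^2$ and $\beta = \frac{p_1}{p_0 \alpha} = \frac{p_1}{(1-p_1)\alpha}$. Thus, with the change of
variable $t = e^{\frac{\alpha-1}{\alpha} y}$, $I_{22}$ may be written:

$$ I_{22} = \frac{p_0 \alpha}{\alpha-1} \int_1^\infty t^{-\frac{\alpha}{\alpha-1}-1} \ln(1+\beta t)\, dt = \frac{p_0 \alpha}{\alpha-1} \left\{ \left[\frac{1-\alpha}{\alpha}\, t^{-\frac{\alpha}{\alpha-1}} \ln(1+\beta t)\right]_1^\infty - \frac{1-\alpha}{\alpha}\, \beta \int_1^\infty \frac{t^{-\frac{\alpha}{\alpha-1}}}{1+\beta t}\, dt \right\}. \tag{I.53} $$

The integral on the RHS of (I.53) may be computed as [15]:

$$ \int_1^\infty \frac{t^{-\frac{\alpha}{\alpha-1}}}{1+\beta t}\, dt = \frac{\alpha-1}{\alpha\beta}\; {}_2F_1\left(1, 1+\frac{1}{\alpha-1}, 2+\frac{1}{\alpha-1}, -\frac{1}{\beta}\right). \tag{I.54} $$

Substituting (I.54) in (I.53), we obtain:

$$ I_{22} = p_0 \left[\ln(1+\beta) + \frac{\alpha-1}{\alpha}\; {}_2F_1\left(1, 1+\frac{1}{\alpha-1}, 2+\frac{1}{\alpha-1}, -\frac{1}{\beta}\right)\right], \tag{I.55} $$

and thus combining (I.51), (I.52) and (I.55) yields:

$$ I_2 = p_0\left[\ln(p_0) - 1\right] + p_0 \left[\ln(1+\beta) + \frac{\alpha-1}{\alpha}\; {}_2F_1\left(1, 1+\frac{1}{\alpha-1}, 2+\frac{1}{\alpha-1}, -\frac{1}{\beta}\right)\right]. \tag{I.56} $$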



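The integral $I_4$ may be computed similarly. We skip the details and give below the final result:

$$ I_4 = p_1 \ln(p_0) - p_1 \alpha + p_1 \left[\ln(1+\beta) + (\alpha-1)\; {}_2F_1\left(1, \frac{1}{\alpha-1}, 1+\frac{1}{\alpha-1}, -\frac{1}{\beta}\right)\right]. \tag{I.57} $$

Following (I.48), (I.49), (I.50), (I.56), (I.57) and using the fact that:

$$ {}_2F_1\left(1, \frac{1}{\alpha-1}, 1+\frac{1}{\alpha-1}, -\frac{1}{\beta}\right) = 1 - \frac{1-p_1}{p_1}\; {}_2F_1\left(1, 1+\frac{1}{\alpha-1}, 2+\frac{1}{\alpha-1}, -\frac{1}{\beta}\right), \tag{I.58} $$

we obtain:

$$ I_{LB}(x_1,p_1) = -\ln(1-p_1) + p_1\left(x_1^2 - \ln(1+x_1^2)\right) - \ln(1+\beta) - \frac{p_1(\alpha-1)}{\alpha} \left[1 + (\alpha-1)\; {}_2F_1\left(1, \frac{1}{\alpha-1}, 1+\frac{1}{\alpha-1}, -\frac{1}{\beta}\right)\right]. \tag{I.59} $$

Combining (I.59) and (I.46) yields (11), which completes the proof of Lemma 1.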
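The closed form obtained at the end of this proof can be evaluated without a special-function library. The sketch below is illustrative and rests on two assumptions of ours: that (I.59) reads $-\ln(1-p_1) + p_1(x_1^2 - \ln(1+x_1^2)) - \ln(1+\beta) - \frac{p_1(\alpha-1)}{\alpha}[1 + (\alpha-1)\,{}_2F_1(1, \frac{1}{\alpha-1}, 1+\frac{1}{\alpha-1}, -\frac{1}{\beta})]$, and that the hypergeometric term is computed from the Euler integral representation ${}_2F_1(1, b, b+1, z) = b\int_0^1 t^{b-1}(1 - zt)^{-1} dt$, rewritten with $t = v^{1/b}$ as $\int_0^1 dv/(1 + v^{1/b}/\beta)$. Function names and the evaluation point are illustrative.

```python
import math


def hyp2f1_special(b, beta, n=20000):
    """2F1(1, b, b+1, -1/beta) for b > 0, beta > 0, via Simpson's rule (n even).

    Uses the integral of 1 / (1 + v**(1/b) / beta) over v in [0, 1].
    """
    def g(v):
        return 1.0 / (1.0 + v ** (1.0 / b) / beta)

    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0


def i_lb_closed_form(x1, a):
    """On-off mutual information in nats from the closed form (I.59), p1 = a/x1**2."""
    u = x1 * x1
    alpha = 1.0 + u
    p1 = a / u
    beta = p1 / ((1.0 - p1) * alpha)
    f1 = hyp2f1_special(1.0 / u, beta)
    return (-math.log1p(-p1) + p1 * (u - math.log(alpha)) - math.log1p(beta)
            - p1 * u / alpha * (1.0 + u * f1))


a = 1e-3
value = i_lb_closed_form(math.sqrt(4.96815), a)
print(value)
```

The individual terms are each of order $a$ and largely cancel, so the hypergeometric factor must be computed accurately; this is the reason for the fine Simpson grid.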



APPENDIX II
PROOF OF THEOREM 1

At low SNR, a discrete input distribution with two mass points, one of them located at zero, 
achieves the non-coherent capacity [6]. That pi = a/x\ was proven in Appendix HI Therefore, 
(fT2)) is true. To derive (fT3l) . it is a matter of series expansion calculus. 

Before proceeding, it should be reminded that for the optimal input distribution given in 
Theorem [U the non-zero mass point location xi is greater than 1 {xi > 1) [6], [13]. Then, 
series expansion of (fTTI) to the second order, around the point (xi, a) = (xi, 0), where Xi is an 
arbitrary real greater than one, can be obtained using Mathematica: 

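$$I_{LB}(x_1, a) = \left(1 - \frac{\log(1+x_1^2)}{x_1^2}\right) a - \pi x_1^2 \left[x_1^2 (1+x_1^2)\right]^{-\left(1+\frac{1}{x_1^2}\right)} \csc\left(\frac{\pi}{x_1^2}\right) a^{1+\frac{1}{x_1^2}} + \frac{a^2}{2(x_1^4 - 1)} + o(a^2), \tag{II.60}$$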



where the symbol $o(a^n)$ represents a function, say $g(x_1, a)$, such that $\lim_{a \to 0} \frac{g(x_1, a)}{a^n} = 0$. Since $x_1 > 1$, there exists $\epsilon > 0$ such that $1 + \frac{1}{x_1^2} < 2 - \epsilon$. Thus, (II.60) may be written as:

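$$I_{LB}(x_1, a) = \left(1 - \frac{\log(1+x_1^2)}{x_1^2}\right) a - \pi x_1^2 \left[x_1^2 (1+x_1^2)\right]^{-\left(1+\frac{1}{x_1^2}\right)} \csc\left(\frac{\pi}{x_1^2}\right) a^{1+\frac{1}{x_1^2}} + o\left(a^{1+\frac{1}{x_1^2}}\right), \tag{II.61}$$

which represents a series expansion to an order strictly less than 2. Up to this order, we may, with some abuse of notation, drop the term $o\left(a^{1+1/x_1^2}\right)$ and write (II.61) as: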



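$$I_{LB}(x_1, a) = \left(1 - \frac{\log(1+x_1^2)}{x_1^2}\right) a - \pi x_1^2 \left[x_1^2 (1+x_1^2)\right]^{-\left(1+\frac{1}{x_1^2}\right)} \csc\left(\frac{\pi}{x_1^2}\right) a^{1+\frac{1}{x_1^2}}. \tag{II.62}$$

Maximizing (II.62) with respect to $x_1 > 1$ is equivalent to:

$$\min_{x_1 > 1} \left\{ \frac{\log(1+x_1^2)}{x_1^2} + \pi x_1^2 \left[x_1^2 (1+x_1^2)\right]^{-\left(1+\frac{1}{x_1^2}\right)} \csc\left(\frac{\pi}{x_1^2}\right) a^{\frac{1}{x_1^2}} \right\}. \tag{II.63}$$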
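The optimization in (II.63) can be explored numerically. The following minimal sketch assumes the reading of (II.62) as $\left(1 - \log(1+x_1^2)/x_1^2\right) a - \pi x_1^2 [x_1^2(1+x_1^2)]^{-(1+1/x_1^2)} \csc(\pi/x_1^2)\, a^{1+1/x_1^2}$ and locates the maximizing $x_1$ by a plain grid search. The function names, the search interval and the SNR value $a = 10^{-3}$ are illustrative assumptions; the grid search stands in for the analytical stationarity condition used in the proof.

```python
import math

def ilb_low_snr(x1, a):
    """Two-term low-SNR approximation of I_LB, as read from (II.62); needs x1 > 1."""
    big_x = x1 ** 2
    first = (1.0 - math.log(1.0 + big_x) / big_x) * a
    second = (math.pi * big_x
              * (big_x * (1.0 + big_x)) ** (-(1.0 + 1.0 / big_x))
              / math.sin(math.pi / big_x)      # csc(pi / x1^2)
              * a ** (1.0 + 1.0 / big_x))
    return first - second

def best_mass_point(a, lo=1.05, hi=10.0, steps=20000):
    """Grid search for the x1 in [lo, hi] that maximizes the approximation."""
    best_x, best_val = lo, ilb_low_snr(lo, a)
    for k in range(1, steps + 1):
        x = lo + (hi - lo) * k / steps
        val = ilb_low_snr(x, a)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

a = 1e-3
x_opt, val = best_mass_point(a)
print("x1 maximizing the approximation:", x_opt)
print("approximate mutual information per unit SNR:", val / a)
```

Because both terms subtracted from $a$ are positive for $x_1 > 1$, the approximation always stays below $a$, which is consistent with a non-coherence penalty.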



As was proven in Appendix I, at the maximum we necessarily have $\frac{\partial}{\partial x_1} I_{LB}(x_1, a) = 0$. Differentiating (II.63) with respect to $x_1$ yields (13). Finally, (14) follows from (II.62). This completes the proof of Theorem 1.

Appendix III 
Proof of Theorem 2

For $a < a_0$ and $x_1 > x_0$, (16) may be written as:

$$a(x_1) = \exp\left[ x_1^2\, W(-1, \varphi(x_1)) - x_1^2 + \pi \cot\left(\frac{\pi}{x_1^2}\right) + \ln(x_1^2) + \ln(1+x_1^2) - 1 \right]. \tag{III.64}$$

Moreover, it is easy to check that $a$ in (III.64) is a decreasing function with respect to $x_1$ and that:

$$-x_1^2 + \pi \cot\left(\frac{\pi}{x_1^2}\right) + \ln(x_1^2) + \ln(1+x_1^2) - 1 > 1, \tag{III.65}$$

for $x_1 > x_0$. Thus, using (III.64) and (III.65), we have:

$$a(x_1) > a_{lb}(x_1) = \exp\left[ x_1^2\, W(-1, \varphi(x_1)) + 1 \right], \tag{III.66}$$

where $a_{lb}(x_1)$ is a lower bound on $a(x_1)$. Since $a_{lb}(x_1)$ is also a decreasing function with respect to $x_1$, then for a low SNR value $a$, (III.66) may be seen as a lower bound on the optimal non-zero mass point location $x_1$ and we equivalently have:

$$x_1 > x_{1,lb}, \tag{III.67}$$

where $x_{1,lb}$ is the solution of $a_{lb}(x_1) = a$. Next, we derive a lower bound on $x_{1,lb}$.

Let us fix a low SNR value $a < a_0$ and consider the function on the RHS of (III.66), written for simplicity as:

$$a = \exp\left[ x_{1,lb}^2\, W(-1, \varphi(x_{1,lb})) + 1 \right], \tag{III.68}$$

or equivalently, by letting $y = \sqrt{1 + \ln\left(\frac{1}{a}\right)}$:

$$x_{1,lb} = \frac{y}{\sqrt{-W(-1, \varphi(x_{1,lb}))}}. \tag{III.69}$$

Since $-W(-1, \varphi(x_{1,lb})) > 1$ for $x_{1,lb} > x_0$, it is easy to see that $y^2 > x_{1,lb}^2$. Hence, using the fact that $\varphi(\cdot)$ and $-W(-1, \cdot)$ are strictly increasing functions, we have:

$$x_{1,lb}^{(1)} = \frac{y}{\sqrt{-W(-1, \varphi(y))}} < x_{1,lb} = \frac{y}{\sqrt{-W(-1, \varphi(x_{1,lb}))}}, \tag{III.70}$$

where the superscript $(1)$ on the left hand side of (III.70) denotes a first lower bound. Next we improve the lower bound $x_{1,lb}^{(1)}$ to obtain a tighter one. Before going on, we recall the following result from [16], which aims at solving transcendental equations involving the Lambert function iteratively using self-mapping techniques:

Lemma 3: For the region specified by $x < -1$ and $-\frac{1}{e} < y < 0$, an infinite-ladder solution to the equation:

$$y(x) = x e^x \tag{III.71}$$

is easily identified as

$$x(y) = L_<(y), \tag{III.72}$$

with the ladder $L_<(y)$ defined as

$$L_<(y) = \ln\left( \frac{-y}{-\ln\left( \frac{-y}{-\ln\left( \frac{-y}{\cdots} \right)} \right)} \right). \tag{III.73}$$

Proof: The proof and more details concerning the Lambert function can be found in [16].

■

Clearly, using (III.73) and the fact that the solution of (III.71) is also $x(y) = W(-1, y)$, one can obtain a simple upper bound on the Lambert function in the interval of interest:

$$W(-1, y) < \ln(-y) - \ln\left(-\ln(-y)\right). \tag{III.74}$$
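The self-mapping idea behind Lemma 3 and the bound (III.74) can be illustrated numerically. The sketch below iterates the map $x \leftarrow \ln\left(\frac{-y}{-x}\right)$, which follows from rewriting $y = x e^x$ for $x < 0$, starting from $x = \ln(-y)$; the first iterate is exactly the right-hand side of (III.74). The function names, the iteration count and the sample values of $y$ are assumptions chosen for illustration, and the routine is not meant as a production-grade Lambert function.

```python
import math

def lambert_wm1_ladder(y, iterations=200):
    """Lower real branch W(-1, y) for -1/e < y < 0 by iterating x <- ln(-y / -x).

    The iterates decrease monotonically from ln(-y) towards the solution of
    x * exp(x) = y with x < -1; convergence slows down as y approaches -1/e.
    """
    if not (-1.0 / math.e < y < 0.0):
        raise ValueError("y must lie in the open interval (-1/e, 0)")
    x = math.log(-y)  # starting point, already below -1 on this interval
    for _ in range(iterations):
        x = math.log(-y / -x)
    return x

def upper_bound(y):
    """Right-hand side of (III.74): ln(-y) - ln(-ln(-y))."""
    return math.log(-y) - math.log(-math.log(-y))

for y in (-0.3, -0.1, -0.01):
    w = lambert_wm1_ladder(y)
    print(y, w, upper_bound(y), w * math.exp(w))
```

For each sample value the last printed column reproduces $y$, and the computed branch value lies below the bound, as (III.74) states.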

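Since for $x_{1,lb} > x_0$, $\varphi(x_{1,lb}) \in\, ]-\frac{1}{e}, 0[$ and $W(-1, \varphi(x_{1,lb})) < 0$, then applying (III.74) to $\varphi(x_{1,lb})$ yields:

$$W(-1, \varphi(x_{1,lb})) < \ln\left( \frac{-\varphi(x_{1,lb})}{-\ln\left(-\varphi(x_{1,lb})\right)} \right) \tag{III.75}$$

$$< \ln\left( \frac{-\varphi(y)}{-\ln\left(-\varphi(x_{1,lb})\right)} \right) \tag{III.76}$$

$$< \ln\left(-\varphi(y)\right). \tag{III.77}$$

Inequality (III.76) holds because $y > x_{1,lb}$ and $\varphi(\cdot)$ is an increasing function; likewise, (III.77) follows from the fact that for $x > x_0$, $\varphi(x) > -\frac{1}{e}$ and thus $\frac{1}{-\ln(-\varphi(x))} < 1$. Moreover, (III.77) implies

$$\frac{y}{\sqrt{-\ln(-\varphi(y))}} > \frac{y}{\sqrt{-W(-1, \varphi(x_{1,lb}))}} = x_{1,lb}. \tag{III.78}$$

Applying again respectively $\varphi(\cdot)$ and $-W(-1, \cdot)$ to both sides of (III.78) gives:

$$x_{1,lb}^{(2)} = \frac{y}{\sqrt{-W\left(-1, \varphi\left( \frac{y}{\sqrt{-\ln(-\varphi(y))}} \right)\right)}} < x_{1,lb}. \tag{III.79}$$

Finally, to prove that $x_{1,lb}^{(2)}$ is tighter than $x_{1,lb}^{(1)}$, it is sufficient to note that since $\varphi(x_{1,lb}) \in\, ]-\frac{1}{e}, 0[$, $y > x_{1,lb}$, and $\varphi$ is an increasing function, then $\varphi(y) \in\, ]-\frac{1}{e}, 0[$ and we consequently have $y > \frac{y}{\sqrt{-\ln(-\varphi(y))}}$. Applying again respectively $\varphi(\cdot)$ and $-W(-1, \cdot)$ to this inequality yields:

$$x_{1,lb}^{(1)} = \frac{y}{\sqrt{-W(-1, \varphi(y))}} < \frac{y}{\sqrt{-W\left(-1, \varphi\left( \frac{y}{\sqrt{-\ln(-\varphi(y))}} \right)\right)}} = x_{1,lb}^{(2)}. \tag{III.80}$$

Combining (III.79) and (III.80), we have:

$$x_{1,lb}^{(1)} < x_{1,lb}^{(2)} < x_{1,lb}, \tag{III.81}$$

from which (38) follows by letting $x_{1,LB} = x_{1,lb}^{(2)}$. Finally, (39) may be obtained by applying (14) to $x_{1,LB}$. This completes the proof of Theorem 2.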

References 

[1] Sergio Verdu, "Spectral Efficiency in the Wideband Regime," IEEE Trans. on Information Theory, vol. 48, no. 6, pp. 1319-1343, June 2002.

[2] Muriel Medard, "The Effect upon Channel Capacity in Wireless Communications of Perfect and Imperfect Knowledge of the Channel," IEEE Trans. on Information Theory, vol. 46, no. 3, pp. 933-946, May 2000.

[3] T. Ericson, "A Gaussian Channel with Slow Fading," IEEE Trans. on Information Theory, vol. 16, pp. 353-356, 1970.

[4] G. Foschini, "Layered space time architecture for wireless communication in a fading environment when using multi-element antennas," Bell Systems Technical Journal, vol. 1, pp. 41-59, Autumn 1996.

[5] I. E. Telatar, "Capacity of multi-antenna Gaussian channels," European Trans. on Communication, vol. 10, no. 6, pp. 585-595, Nov. 1999.

[6] Ibrahim C. Abou-Faycal, Mitchell D. Trott and Shlomo Shamai (Shitz), "The Capacity of Discrete-Time Memoryless Rayleigh-Fading Channels," IEEE Trans. on Information Theory, vol. 47, no. 4, pp. 1290-1301, May 2001.

[7] R. R. Perera, K. Nguyen, T. S. Pollock, and T. D. Abhayapala, "Capacity of non-coherent Rayleigh fading MIMO channels," IEE Proceedings - Communications, vol. 153, no. 6, pp. 976-983, Dec. 2006.

[8] T. L. Marzetta and B. M. Hochwald, "Capacity of a mobile multiple-antenna communication link in Rayleigh flat fading," IEEE Trans. on Information Theory, vol. 45, no. 1, pp. 139-157, Jan. 1999.

[9] Lizhong Zheng and David N. C. Tse, "Communication on the Grassmann Manifold: A Geometric Approach to the Noncoherent multiple-antenna channel," IEEE Trans. on Information Theory, vol. 48, no. 2, pp. 359-383, Feb. 2002.

[10] R. S. Kennedy, Fading Dispersive Communication Channels, New York: Wiley, 1969.

[11] I. E. Telatar and D. Tse, "Capacity and Mutual Information of Wideband Multipath Fading Channels," IEEE Trans. on Information Theory, vol. 46, no. 4, pp. 1384-1400, May 2000.

[12] R. G. Gallager, Information Theory and Reliable Communication, New York: Wiley, 1968.

[13] Lizhong Zheng, David N. C. Tse and Muriel Medard, "Channel Coherence in the Low-SNR Regime," IEEE Trans. on Information Theory, vol. 53, no. 3, pp. 976-997, March 2007.

[14] Siddharth Ray, Muriel Medard and Lizhong Zheng, "On Noncoherent MIMO Channels in the Wideband Regime: Capacity and Reliability," IEEE Trans. on Information Theory, vol. 53, no. 6, pp. 1983-2009, June 2007.

[15] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, A. Jeffrey, Ed. Academic Press, Inc., 1980.

[16] Galen Pickett and Yonko Millev, "On the analytic inversion of functions, solution of transcendental equations and infinite self-mappings," Journal of Physics A: Mathematical and General, vol. 35, pp. 4485-4494, 2002.




Fig. 1. Channel mutual information lower bound versus non-zero mass point location for 3 SNR regimes: (a) very low SNR, (b) low SNR and (c) high SNR.




Fig. 2. Location of non-zero mass point versus the SNR value a (linear). 




Fig. 3. Non-coherent capacity versus the SNR value a (linear). 




Fig. 4. Non-coherence penalty per SNR versus the SNR value a (linear).




Fig. 5. Exact non-zero mass point locations and the derived upper and lower bounds as well as those reported in [13] versus 
the SNR value a (linear). 



